fix(toml): swap uiri/toml encoder for tomli_w (issue #439 residual)#566
Conversation
The uiri/toml encoder raises IndexError on strings containing certain characters (notably real control characters like chr(27)/ANSI escape). Issue #439 identified this class of bugs and proposed switching to tomli-w. Decode was migrated to stdlib tomllib on 3.11+ previously, but encode still routed through toml.dumps and still crashed. Reproducer on main: >>> benedict({"color": "\033[31m"}).to_toml() IndexError: list index out of range Change: TOMLSerializer.encode() now calls tomli_w.dumps(). Decode path untouched (tomllib on 3.11+, toml on 3.10). toml stays in the [toml] extra guarded by python_version < '3.11' for the 3.10 decode fallback; tomli-w is added unconditionally for encode. Regression tests cover: - ANSI control character (chr(27)) encode + round-trip — was crashing - Issue #439's literal-backslash examples — guard against regression - Round-trip on 7 tricky values (control chars, tabs, unicode, quotes) - Nested dict with embedded control chars - Direct serializer encode/decode path tests/serializers/test_toml_serializer.py replaces the prior TODO stubs with 5 real tests. test_io_dict_toml's "extra not installed" test patches tomli_w_installed (the encode dependency) instead of toml_installed. API note: tomli_w.dumps kwargs differ from toml.dumps (no `encoder=` param; gains `multiline_strings` and `indent`). Callers of `.to_toml(**kwargs)` passing uiri-specific kwargs will hit TypeError and should migrate to tomli-w's kwarg surface. Full suite: 800 tests pass (1 pre-existing skip). Fixes the encode-side failure mode documented in #439. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes TOML encoding crashes (notably on real control characters like ESC/ANSI sequences) by switching the encode path from uiri/toml to tomli-w, while keeping the existing decode behavior (tomllib on Python 3.11+, toml on 3.10).
Changes:
- Update
TOMLSerializer.encode()to usetomli_w.dumps()and requiretomli-wat runtime for encoding. - Adjust TOML optional-dependency configuration to add
tomli-wand gatetomlto Python < 3.11. - Replace prior TOML serializer TODO tests with concrete regression/round-trip coverage and update the “extra not installed” patch target accordingly.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
benedict/serializers/toml.py |
Swap TOML encoding implementation to tomli-w and add optional import/installed flag. |
pyproject.toml |
Update [project.optional-dependencies].toml to include tomli-w and conditionally include toml for Python < 3.11. |
requirements.txt |
Add tomli-w to the pinned dev/test requirements set. |
tests/serializers/test_toml_serializer.py |
Add regression and round-trip tests covering control characters and other tricky strings. |
tests/dicts/io/test_io_dict_toml.py |
Patch tomli_w_installed (encode dependency) for the “extra not installed” test. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def test_decode_toml(self) -> None: | ||
| # TODO | ||
| pass | ||
| def test_encode_ansi_control_character(self): |
There was a problem hiding this comment.
Test methods in this file omit the -> None return type annotation, while the other serializer test modules consistently include it (e.g., tests/serializers/test_abstract_serializer.py:12). For consistency (and to keep type-checking expectations uniform), please add -> None to these new test methods.
| def test_encode_ansi_control_character(self): | ||
| """Scenario 1 — falsification clause #1. | ||
|
|
||
| `benedict({"color": "\033[31m"}).to_toml()` must not raise. On |
There was a problem hiding this comment.
The docstring includes "\033[31m" inside a Python string literal, so \033 will be interpreted and the docstring will contain a real ESC control character at runtime. This can lead to invisible/control characters in test output and makes the docstring harder to read/copy. Consider escaping it as \\033[31m (or writing it as \x1b[31m) in the docstring example so the docstring remains printable text.
| `benedict({"color": "\033[31m"}).to_toml()` must not raise. On | |
| `benedict({"color": "\\033[31m"}).to_toml()` must not raise. On |
- Add `-> None` return annotation to all new test methods to match the convention in other tests/serializers/ files. - Escape `\033` as `\\033` in the test_encode_ansi_control_character docstring so the docstring renders as printable text. No behavior change; pure consistency polish. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@copilot apply changes based on the comments in this thread |
There was a problem hiding this comment.
@ricaskew thanks for the PR, the crash is confirmed and the fix is good.
I have one concern: we are still using uiri/toml for decode on Python 3.10, but this library is abandoned and has known bugs. Since we are already adding tomli_w (from hukkin), I think it is better to finish the migration and remove uiri/toml completely.
My proposal:
decode->tomli(on Python 3.10 only) +tomllib(stdlib on Python 3.11+)encode->tomli_w(already in this PR)
So the dependencies would be:
- on Python 3.10: only
tomli+tomli-w - on Python 3.11+: only stdlib +
tomli-w
Can you update the PR in this way? I'm happy to review again after.
PS. remember to run pre-commit.
Per maintainer review (CHANGES_REQUESTED, fabiocaccamo): replace abandoned uiri/toml dependency with tomli (Python 3.10) + tomllib (Python 3.11+). Encode side stays on tomli_w (no change). Net: uiri/toml fully removed. Fix mypy CI gate: - benedict/serializers/toml.py:52 — explicit str annotation on tomli_w.dumps() return value to satisfy [no-any-return] check. Auxiliary test file rename + typo fix: - tests/dicts/io/test_io_dict_toml.py — patched flag renamed toml_installed -> tomli_installed; "tomlib" -> "tomllib" in skip-message. Verified locally: pre-commit clean (9/9 hooks pass including mypy); pytest tests/serializers/test_toml_serializer.py + tests/dicts/io/test_io_dict_toml.py 21 passed, 1 skipped (skip is correct — tomllib available on Py3.11+).
|
@copilot resolve the merge conflicts in this pull request |
|
@fabiocaccamo it's been around 4 days since the @copilot resolve directive and the conflicts on |
|
@ricaskew yes please, go ahead and push, that would unblock things. Thanks for keeping the branch clean. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #566 +/- ##
=======================================
Coverage 98.32% 98.32%
=======================================
Files 64 64
Lines 2389 2392 +3
=======================================
+ Hits 2349 2352 +3
Misses 40 40
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…erage The `except ModuleNotFoundError` bodies for the new `tomli` and `tomli_w` import blocks are unreachable in CI (both packages are pinned in requirements.txt across the matrix), and they fall back to assignment statements with no behavior to exercise. Mirror the convention the file already used pre-PR for the removed `toml` import block: mark each branch with `# pragma: no cover`. Brings codecov/patch coverage from 73.33% (11/15) to 100% (11/11). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
This push (c2cf3fe) addresses the codecov/patch gate. The two new Pre-commit passes locally on the touched file. Re-requesting review. |
The uiri/toml encoder raises IndexError on strings containing certain characters (notably real control characters like chr(27)/ANSI escape). Issue #439 identified this class of bugs and proposed switching to tomli-w. Decode was migrated to stdlib tomllib on 3.11+ previously, but encode still routed through toml.dumps and still crashed.
Reproducer on main:
Change: TOMLSerializer.encode() now calls tomli_w.dumps(). Decode path untouched (tomllib on 3.11+, toml on 3.10). toml stays in the [toml] extra guarded by python_version < '3.11' for the 3.10 decode fallback; tomli-w is added unconditionally for encode.
Regression tests cover:
tests/serializers/test_toml_serializer.py replaces the prior TODO stubs with 5 real tests. test_io_dict_toml's "extra not installed" test patches tomli_w_installed (the encode dependency) instead of toml_installed.
API note: tomli_w.dumps kwargs differ from toml.dumps (no
encoder=param; gainsmultiline_stringsandindent). Callers of.to_toml(**kwargs)passing uiri-specific kwargs will hit TypeError and should migrate to tomli-w's kwarg surface.Full suite: 800 tests pass (1 pre-existing skip).
Fixes the encode-side failure mode documented in #439.
name: Pull request
about: Submit a pull request for this project
assignees: fabiocaccamo
Describe your changes
?
Related issue
?
Checklist before requesting a review